14,414 research outputs found

    A Question of Empowerment: Information Technology and Civic Engagement in New Haven, Connecticut

    Extravagant claims have been made for the capacity of IT (information technology) to empower citizens and to enhance the capacity of civic organizations. This study of IT use by organizations and agencies in New Haven, Connecticut, 1998-2004, tests these claims, finding that the use of IT by nonprofits is selective, tending to serve agencies patronized by community elites rather than populations in need. In addition, the study finds that single-interest groups are far more effective in using IT than more diverse civic and neighborhood groups. This publication is Hauser Center Working Paper No. 30. The Hauser Center Working Paper Series was launched during the summer of 2000. The Series enables the Hauser Center to share with a broad audience important works-in-progress written by Hauser Center scholars and researchers.

    Strong approximations of level exceedences related to multiple hypothesis testing

    Particularly in genomics, but also in other fields, it has become commonplace to undertake highly multiple Student's t-tests based on relatively small sample sizes. The literature on this topic is continually expanding, but the main approaches used to control the family-wise error rate and false discovery rate are still based on the assumption that the tests are independent. The independence condition is known to be false at the level of the joint distributions of the test statistics, but that does not necessarily mean, for the small significance levels involved in highly multiple hypothesis testing, that the assumption leads to major errors. In this paper, we give conditions under which the assumption of independence is valid. Specifically, we derive a strong approximation that closely links the level exceedences of a dependent "studentized process" to those of a process of independent random variables. Via this connection, it can be seen that in high-dimensional, low sample-size cases, provided the sample size diverges faster than the logarithm of the number of tests, the assumption of independent t-tests is often justified. Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) at http://dx.doi.org/10.3150/09-BEJ220 by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
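    A minimal simulation sketch, not taken from the paper, of the setting described above: many small-sample t-tests whose dependence is ignored when standard independence-based corrections (Bonferroni for the family-wise error rate, Benjamini-Hochberg for the false discovery rate) are applied. The data-generating process and all parameter values are illustrative assumptions.

```python
# Sketch: many small-sample t-tests under the global null, with corrections
# that assume independence. All parameters are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p = 10, 5000          # small sample size, many simultaneous tests
alpha = 0.05

# Null data: every test's null hypothesis (mean zero) is true here.
x = rng.normal(size=(p, n))
t_stat, p_val = stats.ttest_1samp(x, popmean=0.0, axis=1)

# Family-wise error control assuming independent tests (Bonferroni).
bonferroni_hits = np.sum(p_val < alpha / p)

# False discovery rate control assuming independence (Benjamini-Hochberg).
sorted_p = np.sort(p_val)
thresholds = alpha * np.arange(1, p + 1) / p
below = np.flatnonzero(sorted_p <= thresholds)
bh_hits = 0 if below.size == 0 else below[-1] + 1

print(f"Bonferroni rejections under the global null: {bonferroni_hits}")
print(f"Benjamini-Hochberg rejections under the global null: {bh_hits}")
```

    The paper's contribution concerns when such independence-based thresholds remain trustworthy for dependent, studentized statistics; the sketch only sets up the computation they would be applied to.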

    Modeling the variability of rankings

    For better or for worse, rankings of institutions, such as universities, schools and hospitals, play an important role today in conveying information about relative performance. They inform policy decisions and budgets, and are often reported in the media. While overall rankings can vary markedly over relatively short time periods, it is not unusual to find that the ranks of a small number of "highly performing" institutions remain fixed, even when the data on which the rankings are based are extensively revised, and even when a large number of new institutions are added to the competition. In the present paper, we endeavor to model this phenomenon. In particular, we interpret as a random variable the value of the attribute on which the ranking should ideally be based. More precisely, if p items are to be ranked then the true, but unobserved, attributes are taken to be values of p independent and identically distributed variates. However, each attribute value is observed only with noise, and via a sample of size roughly equal to n, say. These noisy approximations to the true attributes are the quantities that are actually ranked. We show that, if the distribution of the true attributes is light-tailed (e.g., normal or exponential) then the number of institutions whose ranking is correct, even after recalculation using new data and even after many new institutions are added, is essentially fixed. Formally, p is taken to be of order n^C for any fixed C > 0, and the number of institutions whose ranking is reliable depends very little on p. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/10-AOS794 by the Institute of Mathematical Statistics (http://www.imstat.org).
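    A small simulation sketch of the model described above, under assumed distributions: p true attributes drawn i.i.d. from a light-tailed (exponential) law, each observed with noise of order n^(-1/2), and the resulting ranks compared across two independently perturbed rankings. All names and parameter values are illustrative, and the sketch does not reproduce the paper's theory.

```python
# Sketch of the ranking model: light-tailed true attributes observed with
# noise, ranked twice with fresh noise, and agreement of the top ranks checked.
# Distributional choices and parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 200                                   # effective sample size per institution
C = 1.5
p = int(n ** C)                           # number of institutions, p ~ n^C

theta = rng.exponential(size=p)           # light-tailed true attributes
true_ranks = np.argsort(np.argsort(-theta))   # rank 0 = best

def noisy_ranks():
    # Each attribute is observed with noise of order n^(-1/2) before ranking.
    observed = theta + rng.normal(scale=1 / np.sqrt(n), size=p)
    return np.argsort(np.argsort(-observed))

ranks_1, ranks_2 = noisy_ranks(), noisy_ranks()

# Among the institutions with the best true attributes, count how many keep
# their correct rank in both independently perturbed rankings.
top = 20
in_top = true_ranks < top
stable = np.sum((ranks_1 == true_ranks) & (ranks_2 == true_ranks) & in_top)
print(f"{stable} of the top {top} true ranks are recovered exactly in both noisy rankings")
```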

    Nonparametric estimation of mean-squared prediction error in nested-error regression models

    Nested-error regression models are widely used for analyzing clustered data. For example, they are often applied to two-stage sample surveys, and in biology and econometrics. Prediction is usually the main goal of such analyses, and mean-squared prediction error is the main way in which prediction performance is measured. In this paper we suggest a new approach to estimating mean-squared prediction error. We introduce a matched-moment, double-bootstrap algorithm, enabling the notorious underestimation of the naive mean-squared error estimator to be substantially reduced. Our approach does not require specific assumptions about the distributions of errors. Additionally, it is simple and easy to apply. This is achieved through using Monte Carlo simulation to implicitly develop formulae which, in a more conventional approach, would be derived laboriously by mathematical arguments. Supported in part by NSF Grant SES-03-18184.
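    As a concrete reference for the model class, here is a minimal sketch that simulates data from a nested-error regression model y_ij = beta0 + beta1*x_ij + u_i + e_ij and forms a shrinkage (BLUP-type) prediction of each cluster effect, treating the variance components as known to keep the code short. It is only an illustration of the model being discussed; the paper's matched-moment double-bootstrap estimator of mean-squared prediction error is not reproduced, and all parameter values are assumptions.

```python
# Sketch of a nested-error regression model and a shrinkage predictor of the
# cluster effects, with known variance components. Illustrative only; the
# paper's matched-moment double bootstrap is not reproduced here.
import numpy as np

rng = np.random.default_rng(2)
m, n_i = 30, 8                     # clusters and observations per cluster
beta0, beta1 = 1.0, 2.0
sigma_u, sigma_e = 0.5, 1.0        # cluster-effect and unit-error std deviations

x = rng.uniform(size=(m, n_i))
u = rng.normal(scale=sigma_u, size=m)
e = rng.normal(scale=sigma_e, size=(m, n_i))
y = beta0 + beta1 * x + u[:, None] + e

# Estimate the fixed effects by ordinary least squares, then shrink the
# cluster-level average residuals toward zero (BLUP with known variances).
X = np.column_stack([np.ones(m * n_i), x.ravel()])
beta_hat, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
cluster_resid = (y - (beta_hat[0] + beta_hat[1] * x)).mean(axis=1)
gamma = sigma_u**2 / (sigma_u**2 + sigma_e**2 / n_i)   # shrinkage factor
u_hat = gamma * cluster_resid

realized_error = np.mean((u_hat - u) ** 2)   # Monte Carlo check against the simulated truth
print(f"Shrinkage factor = {gamma:.3f}, realized squared prediction error = {realized_error:.3f}")
```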

    Nonparametric estimation of a point-spread function in multivariate problems

    The removal of blur from a signal, in the presence of noise, is readily accomplished if the blur can be described in precise mathematical terms. However, there is growing interest in problems where the extent of blur is known only approximately, for example in terms of a blur function which depends on unknown parameters that must be computed from data. More challenging still is the case where no parametric assumptions are made about the blur function. There has been a limited amount of work in this setting, but it invariably relies on iterative methods, sometimes under assumptions that are mathematically convenient but physically unrealistic (e.g., that the operator defined by the blur function has an integrable inverse). In this paper we suggest a direct, noniterative approach to nonparametric, blind restoration of a signal. Our method is based on a new, ridge-based method for deconvolution, and requires only mild restrictions on the blur function. We show that the convergence rate of the method is close to optimal, from some viewpoints, and demonstrate its practical performance by applying it to real images. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/009053606000001442 by the Institute of Mathematical Statistics (http://www.imstat.org).
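    A minimal Fourier-domain sketch of ridge-regularized deconvolution, the general device on which a ridge-based method builds. Here the blur function is treated as known, which is precisely the assumption the paper removes, so this is only the non-blind special case; the signal, blur kernel and ridge parameter are illustrative assumptions.

```python
# Sketch of ridge-regularized deconvolution in the Fourier domain with a
# KNOWN blur function (the non-blind special case; the paper's method also
# handles unknown blur). Signal, blur and ridge parameter are illustrative.
import numpy as np

rng = np.random.default_rng(3)
n = 512
t = np.arange(n)

signal = np.sin(2 * np.pi * t / 64) + (t > n // 2)     # true signal: wave plus a step
blur = np.exp(-0.5 * ((t - n // 2) / 5.0) ** 2)        # Gaussian blur kernel
blur /= blur.sum()
blur = np.roll(blur, -n // 2)                          # centre the kernel at index 0

observed = np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(blur)))
observed += rng.normal(scale=0.05, size=n)             # additive noise

# Ridge deconvolution: divide in the Fourier domain, damping frequencies where
# the blur transform is small instead of inverting them exactly.
F_obs, F_blur = np.fft.fft(observed), np.fft.fft(blur)
ridge = 1e-2
estimate = np.real(np.fft.ifft(F_obs * np.conj(F_blur) / (np.abs(F_blur) ** 2 + ridge)))

print(f"RMS error of ridge estimate: {np.sqrt(np.mean((estimate - signal) ** 2)):.3f}")
```

    The ridge term plays the same role as regularization in ill-posed inverse problems: it avoids the integrable-inverse assumption criticized above by never dividing by near-zero values of the blur transform.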

    Assessing extrema of empirical principal component functions

    The difficulties of estimating and representing the distributions of functional data mean that principal component methods play a substantially greater role in functional data analysis than in more conventional finite-dimensional settings. Local maxima and minima in principal component functions are of direct importance; they indicate places in the domain of a random function where influence on the function value tends to be relatively strong but of opposite sign. We explore statistical properties of the relationship between extrema of empirical principal component functions, and their counterparts for the true principal component functions. It is shown that empirical principal component functions have relatively little trouble capturing conventional extrema, but can experience difficulty distinguishing a "shoulder" in a curve from a small bump. For example, when the true principal component function has a shoulder, the probability that the empirical principal component function has instead a bump is approximately equal to 1/2. We suggest and describe the performance of bootstrap methods for assessing the strength of extrema. It is shown that the subsample bootstrap is more effective than the standard bootstrap in this regard. A "bootstrap likelihood" is proposed for measuring extremum strength. Exploratory numerical methods are suggested. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/009053606000000371 by the Institute of Mathematical Statistics (http://www.imstat.org).
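    A short functional-PCA sketch, on simulated curves with illustrative parameters, showing how an empirical principal component function is obtained from discretized data and where its interior local extrema sit. The paper's bootstrap-likelihood assessment of extremum strength is not reproduced.

```python
# Sketch: estimate the first principal component function from simulated
# curves and locate its interior local extrema. Data generation and all
# parameter choices are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
n_curves, n_grid = 100, 200
t = np.linspace(0, 1, n_grid)

# Random functions: a smooth mean plus two random harmonic components and noise.
mean_fn = np.sin(2 * np.pi * t)
scores = rng.normal(size=(n_curves, 2)) * np.array([1.0, 0.4])
basis = np.vstack([np.sin(3 * np.pi * t), np.cos(5 * np.pi * t)])
curves = mean_fn + scores @ basis + rng.normal(scale=0.1, size=(n_curves, n_grid))

# Empirical principal component functions: eigenvectors of the sample
# covariance of the discretized, centred curves.
centred = curves - curves.mean(axis=0)
cov = centred.T @ centred / n_curves
eigvals, eigvecs = np.linalg.eigh(cov)
pc1 = eigvecs[:, -1]                         # first empirical PC function

# Lightly smooth the discretized PC, then flag interior sign changes of the
# first difference as local extrema (ignoring points near the boundary).
pc1_smooth = np.convolve(pc1, np.ones(9) / 9, mode="same")
d = np.diff(pc1_smooth)
extrema = np.where(np.sign(d[:-1]) * np.sign(d[1:]) < 0)[0] + 1
extrema = extrema[(extrema > 10) & (extrema < n_grid - 10)]
print(f"Local extrema of the first empirical PC function near t = {np.round(t[extrema], 2)}")
```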

    Robustness of multiple testing procedures against dependence

    An important aspect of multiple hypothesis testing is controlling the significance level, or the level of Type I error. When the test statistics are not independent it can be particularly challenging to deal with this problem, without resorting to very conservative procedures. In this paper we show that, in the context of contemporary multiple testing problems, where the number of tests is often very large, the difficulties caused by dependence are less serious than in classical cases. This is particularly true when the null distributions of test statistics are relatively light-tailed, for example, when they can be based on Normal or Student's t approximations. There, if the test statistics can fairly be viewed as being generated by a linear process, an analysis founded on the incorrect assumption of independence is asymptotically correct as the number of hypotheses diverges. In particular, the point process representing the null distribution of the indices at which statistically significant test results occur is approximately Poisson, just as in the case of independence. The Poisson process also has the same mean as in the independence case, and of course exhibits no clustering of false discoveries. However, this result can fail if the null distributions are particularly heavy-tailed. There, clusters of statistically significant results can occur, even when the null hypothesis is correct. We give an intuitive explanation for these disparate properties in light- and heavy-tailed cases, and provide rigorous theory underpinning the intuition. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/07-AOS557 by the Institute of Mathematical Statistics (http://www.imstat.org).
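    An illustrative simulation sketch, with assumed parameters, of the comparison drawn above: null test statistics generated by a light-tailed linear (moving-average) process, with the number of level exceedances compared to the Poisson mean predicted under independence. It is not the paper's analysis, only a way to set up the scenario numerically.

```python
# Sketch: null test statistics from a light-tailed linear (moving-average)
# process, compared with the Poisson count predicted under independence.
# The process, threshold and all parameters are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
p = 20000                                # number of hypotheses
weights = np.array([0.6, 0.3, 0.1])      # moving-average coefficients

# Dependent Gaussian nulls: a moving average of white noise, rescaled to unit variance.
z = rng.normal(size=p + len(weights) - 1)
stat = np.convolve(z, weights, mode="valid") / np.linalg.norm(weights)

level = 1e-3                             # per-test significance level
threshold = stats.norm.ppf(1 - level)
exceed = np.flatnonzero(stat > threshold)

expected = p * level                     # Poisson mean under assumed independence
print(f"Observed exceedances: {exceed.size}, independence-based Poisson mean: {expected:.1f}")
if exceed.size > 1:
    print(f"Smallest gap between significant indices: {np.diff(exceed).min()}")
```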

    Nonparametric regression with homogeneous group testing data

    We introduce new nonparametric predictors for homogeneous pooled data in the context of group testing for rare abnormalities and show that they achieve optimal rates of convergence. In particular, when the level of pooling is moderate, then despite the cost savings, the method enjoys the same convergence rate as in the case of no pooling. In the setting of "over-pooling" the convergence rate differs from that of an optimal estimator by no more than a logarithmic factor. Our approach improves on the random-pooling nonparametric predictor, which is currently the only nonparametric method available, unless there is no pooling, in which case the two approaches are identical. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/11-AOS952 by the Institute of Mathematical Statistics (http://www.imstat.org).
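    A simple plug-in sketch of homogeneous pooling for group-testing regression, with assumed parameters: individuals are sorted by the covariate so that pools are homogeneous, each pool's test result is positive if any member is positive, and an individual-level probability is backed out from a kernel estimate of the pool-level probability. This illustrates the data structure only and is not necessarily the predictor proposed in the paper.

```python
# Sketch: homogeneous pooling for group-testing regression. Illustrative
# plug-in construction with assumed parameters, not the paper's estimator.
import numpy as np

rng = np.random.default_rng(6)
N, k = 4000, 5                           # individuals and pool size
x = np.sort(rng.uniform(size=N))         # sorting by covariate gives homogeneous pools
q_true = 0.02 + 0.06 * x                 # true (rare) abnormality probability
status = rng.random(N) < q_true

pools = status.reshape(-1, k).any(axis=1).astype(float)   # pooled test results
x_pool = x.reshape(-1, k).mean(axis=1)                    # pool-level covariate

def nadaraya_watson(x0, xg, yg, h=0.1):
    # Gaussian-kernel estimate of the pool-level positive probability at x0.
    w = np.exp(-0.5 * ((x0 - xg) / h) ** 2)
    return np.sum(w * yg) / np.sum(w)

grid = np.linspace(0.05, 0.95, 10)
p_pool = np.array([nadaraya_watson(x0, x_pool, pools) for x0 in grid])
q_hat = 1 - (1 - p_pool) ** (1 / k)      # back out the individual-level probability

print("estimated:", np.round(q_hat, 3))
print("true:     ", np.round(0.02 + 0.06 * grid, 3))
```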